ABSTRACT

The use of tests in personnel decisions has become an increasing legal
liability for employers. The major questions raised by the courts
concerning this use of tests are described. Current federal guidelines
for performance appraisal systems, as established by the Equal
Employment Opportunity Commission, are explained and traced to Title VII
of the 1964 Civil Rights Act. The legal implications of prima facie
discrimination and the assessment of adverse impact upon minorities are
explained. The process of judicial validation of performance appraisal
systems is discussed, including specific case examples and flow charts.
In deciding whether the criteria used in personnel decisions are valid
and nondiscriminatory, the courts have utilized predictive, concurrent,
and content validity evaluations. The legal preferences and problems
associated with each type of validity are described. The role of
selection ratios, adverse impact, and business necessity in judicial
validity decisions is discussed. The courts are concerned with the
statistical correlation between test results and the criterion measures
of job performance, and it is felt that the conflicting definitions of
test validity and fairness provided by industrial psychologists have
caused problems in the courts. The social controversy surrounding civil
rights and employment testing is also discussed briefly. (Author/JAC)

Title VII of the 1964 Civil Rights Act required that no employer
discriminate against any individual on the basis of race, color, creed,
sex, or national origin. Since the enactment of this legislation some 12
years ago, personnel practices of both public and private employers have
been subject to severe scrutiny. The results of well over 100 legal cases
have clearly demonstrated that personnel practices long accepted as routine
and essential can become significant legal liabilities. The issues
involved have been emotionally charged and the stakes have been high.*



*For example, over a period of more than a year during 1975-76 Chicago was
without 75 million to 95 million dollars in revenue sharing funds. A
federal judge had ordered the funds impounded and declared the city's police
officer's exam, sergeant's exam, and sergeant's performance ratings
discriminatory against blacks and chicanos in violation of Title VII of the
Civil Rights Act. The city was enjoined from further use of the tests or
ratings and was ordered to hire and promote minorities and women in
accordance with court-imposed quotas until such time as nondiscriminatory
testing could be established (U.S. v. City of Chicago, et al., January 5,
1976, Memorandum Decision).



A paper presented in the symposium "Performance Appraisal and Feedback:
Flies in the Ointment," Division 14, Annual Meeting of the American
Psychological Association, Washington, D.C., September 5, 1976. An earlier
version of this paper was presented at the Conference on Performance
Appraisal, Center for Creative Leadership, Greensboro, N.C., January 1976.



On January 18, 1973 the U.S. District Court for the Eastern District of
Pennsylvania approved a consent decree between AT&T, the Equal Employment
Opportunity Commission, and the Department of Labor. The company agreed to
compensate women and minority employees with payments which were estimated
to run between 12 and 13 million dollars. These payments were intended as
retroactive compensation to those who in the past may have been victims of
discrimination in promotion, transfers, and salary administration (Miner,
1974). In addition AT&T agreed to implement personnel practices which would
achieve a balance between the proportions of women and minorities in its
various occupations and proportions in the relevant labor force. The latter
type of agreement is referred to in various government memoranda as "goals
and timetables" as distinguished from the court-imposed quotas specified in
the Chicago decision. A further note about the AT&T consent decree is that
the company agreed that results of future testing of minority applicants
could not be used as a justification for failure to achieve the goal of
proportional representation.

In light of such outcomes of litigation, one can understand why employers
are discontinuing the use of testing and performance appraisals for hiring
and promotion. They have assessed the uncertainties of the current situation
and decided that whatever gain in organizational efficiency these personnel
practices provide is not worth the legal risks involved. The inevitable
result has been movement toward random hiring and promotion. However, the
long term loss of efficiency when valid employer assessment procedures are
discontinued is also a high price to pay. Thus, the employer who is faced
with weighing the legal risks of keeping a performance appraisal system
versus the long term inefficiency risks of dropping it needs to be able to
properly assess the strengths and weaknesses in their system. The purpose of
the present paper is to outline the major legal questions courts have raised
concerning the use of tests and other criteria in personnel decision making.
The paper will be primarily concerned with the legal questions raised
concerning appraisal systems. Also, the concept of judicial validation will
be defined. By judicial validation I mean the decision process which has
developed in the courts to evaluate whether criteria used to make personnel
decisions are valid or invalid, nondiscriminatory or discriminatory, legal
or illegal.

One might question the wisdom and utility of conceptualizing yet another
type of validity. What with content, construct, concurrent, predictive,
face, and synthetic validities already in the psychologist's lexicon, who
needs another? I would argue that judicial validity has much to recommend
itself, not the least of which is that it is the one that counts in the
real world.

Origins of Current Federal Guidelines 

In understanding the legal requirements imposed on performance appraisal
systems it is helpful to consider current requirements in historical
perspective. Title VII of the 1964 Civil Rights Act addresses, in very
general terms, the notion that it is unlawful for employers to discriminate
on the basis of race, color, religion, sex, and national origin. In the
course of congressional debate the Tower Amendment was added to the bill in
an attempt to clarify the intent of the Act. Briefly, the Tower Amendment
said the Act did not preclude an employer's use of and acting on any
"professionally developed test, provided such test is not designed,
intended, or used to discriminate because of race, color, religion, sex or
national origin." Originally Senator Tower formulated the Amendment to
forestall the intrusion of the Federal Government into management's right
to prescribe employment qualifications (Ash, 1966). In retrospect, the
Amendment has not had such an effect and was in fact the main stimulus for
current federal guidelines which detail the characteristics of acceptable
personnel decision systems.

In 1966, the Equal Employment Opportunity Commission, as the administrative
agency charged with implementing the provisions of the Civil Rights Act,
issued its first set of guidelines, titled "Guidelines on Employment Testing
Procedures." It is clear from the first paragraph in the 1966 EEOC
Guidelines that they were formulated to interpret the language in the Tower
Amendment. Two major characteristics of the 1966 guidelines were the
adoption of the APA Standards for Educational and Psychological Tests and
Manuals (1966) and the requirement of separate criterion-related validity
studies for minority and majority groups. This latter requirement involves
checking for differential validity, which will be explained in more detail
later on in the paper.

In 1970 the EEOC published a revised version of the guidelines in the
Federal Register. The title of the revision is "Guidelines on Employee
Selection Procedures." The word testing in the title of the 1966 guidelines
had been changed to selection, which suggests the expanded scope of the 1970
guidelines. Of major concern for the present discussion is the comprehensive
definition of the word "test" in the 1970 guidelines:



For the purpose of the guidelines in this part, the term 'test' is defined
as any paper-and-pencil or performance measure used as a basis for any
employment decision... The term 'test' includes all formal, scored,
quantified or standardized techniques of assessing job suitability...
(1607.2).

In a recent issue of The Conference Board Record, Lazer (1976) offered the
tentative conclusion that this definition covers performance appraisals. It
is clear from the scope of this definition and subsequent interpretations by
the courts that use of performance appraisal systems for employment
decisions comes under these guidelines. It is the interpretation by the
courts of the 1970 EEOC guidelines which provides the current legal
definition of nondiscriminatory personnel practices.

EEOC Guidelines

In attempting to understand the EEOC guidelines one must remember they were
originally formulated to cover the use of paper and pencil ability tests, or
"professionally developed tests" in the words of the Tower Amendment. In the
process, EEOC has adopted as a minimum standard the standard for test
validity set forth by the American Psychological Association. There are many
who are of the opinion that applying standards developed for commercially
produced written tests to systems of assessment like performance appraisals
makes it extremely difficult or impossible to defend performance appraisals.
The lack of success employers have had in defending performance appraisals
in the courts seems to substantiate this conclusion. I would hasten to add,
however, that the performance appraisal systems challenged in court cases to
date have not been sterling examples of sound personnel practice (cf. Holley
and Field, 1975, for a review of several cases involving performance
appraisals). This fact has made defense of good appraisal systems even more
difficult. The requirements for demonstrating that an appraisal system is
nondiscriminatory have become more complex and more stringent. Further, the
courts have developed a deep scepticism about any assessment technique
involving supervisory judgments. In fact, the District Court in the Chicago
Police Department case in January of this year concluded without
qualification "that supervisory ratings are not a fair measurement of an
employee's suitability for promotion" (U.S. v. City of Chicago, 8 EPD 9785,
1974). The court further interprets the testimony of defendants' and
plaintiffs' expert witnesses, both well known industrial psychologists, as
being in agreement with this conclusion. With such an unqualified negation
of the usefulness of supervisor ratings, it is little wonder that defense of
any performance appraisal system, no matter how thoroughly developed, is an
uphill battle.

Most of the court decisions to date have not had as their major focus the
question of the validity of performance appraisals. However, supervisor
ratings have been a favorite criterion in predictive and concurrent
validation studies. From the courts' critiques of such studies one can
garner a great deal of information on how they view supervisors' ratings.
For example, the Supreme Court in Albemarle v. Moody found the validation
studies conducted by the employer materially defective in part because
"Albemarle's supervisors were asked to rank employees by a 'standard' that
was extremely vague and fatally open to divergent interpretation..." Lower
courts have manifested a more general suspicion of supervisor ratings, as
exemplified by the above quote from the Chicago Police Department case.



What then are the legal requirements for a performance appraisal system to
be nondiscriminatory? The definition of discrimination given in the EEOC
guidelines was endorsed by the Supreme Court in the 1971 Griggs vs Duke
Power case as "the administrative interpretation of the (Civil Rights) Act
by the enforcing agency" and consequently entitled to "great deference." The
following is the complete text of the EEOC definition of discriminatory use
of "tests":

The use of any test which adversely affects hiring, promotion, transfer or
any other employment or membership opportunity of classes protected by Title
VII constitutes discrimination unless: (a) the test has been validated and
evidences a high degree of utility as hereinafter described, and (b) the
person giving or acting upon the results of the particular test can
demonstrate that alternative suitable hiring, transfer or promotion
procedures are unavailable for his use. (1607.3).

In the decision of the Supreme Court in Albemarle vs Moody in June 1975, the
court reaffirmed its endorsement of part (a) of the EEOC definition but
modified part (b). It is now legally the burden of the complaining party to
make a showing that other procedures for hiring, transfer or promotion are
available.

The first part of the definition involves what the courts have come to
define as a prima facie case of discrimination. Specifically, performance
appraisals are prima facie discriminatory if their use in personnel decision
making results in hiring, promotion, transfer, or layoffs in a racial
pattern significantly different from the pool of applicants. The burden of
proving a prima facie case of discrimination lies legally with the
complaining party. If plaintiffs can demonstrate that decisions based upon
performance appraisals have an adverse impact on minorities, the burden of
proof for establishing the validity of the performance appraisals is shifted
to the employer. Thus, the process of proving a charge of discrimination
involves two steps: first, the plaintiffs must establish a prima facie case
of discrimination involving adverse impact on minorities; second, the
employer must fail to demonstrate a relationship between performance
appraisal scores and performance on the job. While the first step in the
process is the legal burden of complaining parties, it behooves employers to
make careful assessment of any adverse impact on minorities that decisions
based upon performance appraisals are having.

Assessing Adverse Impact

The means by which one assesses the adverse impact of performance appraisals
depends upon the nature of the personnel decision it supports. If the
decision is dichotomous, such as promote or not promote, retain or lay off,
then a direct statistical comparison of proportions of minority and majority
applicants assigned to the same status should be made. If the performance
appraisal results in assigning employees to categories such as more than
acceptable, acceptable, questionable, and not acceptable, then statistical
comparisons of the frequencies of minorities and nonminorities in each
category should be made. If a numerical score is assigned to individuals,
such as in the use of summated ratings or behaviorally anchored scales, then
the averages for minority and nonminority groups should be statistically
compared. In each of these comparisons, if the differences observed are
likely to occur less than once in twenty times by chance alone, the courts
are certain to consider this clear evidence of adverse impact.
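
The dichotomous case (promote or not promote) can be sketched as a pooled two-proportion z-test. This is only an illustration: neither the guidelines nor the courts prescribe this particular test, and the function name, the counts, and the choice of a normal approximation are assumptions made for the example. The 0.05 cutoff stands in for the "once in twenty times" standard.

```python
import math


def two_proportion_z_test(minority_selected, minority_total,
                          majority_selected, majority_total):
    """Pooled two-proportion z-test on selection rates.

    Returns (z, two_sided_p). A negative z means the minority
    selection rate is lower than the majority rate.
    """
    if minority_total <= 0 or majority_total <= 0:
        raise ValueError("group sizes must be positive")
    rate_minority = minority_selected / minority_total
    rate_majority = majority_selected / majority_total
    pooled = (minority_selected + majority_selected) / (minority_total + majority_total)
    std_err = math.sqrt(pooled * (1 - pooled)
                        * (1 / minority_total + 1 / majority_total))
    if std_err == 0:
        raise ValueError("selection rate is 0 or 1 in both groups; test undefined")
    z = (rate_minority - rate_majority) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 10 of 50 minority and 40 of 100 majority candidates promoted.
z, p = two_proportion_z_test(10, 50, 40, 100)
flagged = p < 0.05  # "less than once in twenty times by chance alone"
```

With small groups the normal approximation is rough, and an exact test would be the safer choice.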

While statistically significant differences between the performance of
minority and nonminority groups are sufficient to establish a prima facie
case, they are not always necessary. Lower courts, in interpreting the
Supreme Court's position on what constitutes a prima facie case, have
approved other ways to establish such a showing. Extreme
under-representation of minorities in various echelons of a promotional
structure may establish a prima facie case without reference to applicant
pools. Disparities between proportions of minorities employed by a company
compared with the general population or a similarly situated work force may
also establish a prima facie case. Finally, apparent discrepancies between
the performance ratings of minorities and nonminorities that are not
statistically significant may still be interpreted by the courts as adverse
impact. The Chicago police case cited earlier involved efficiency ratings
with a one point difference in means, 85.2 for whites and 84.3 for blacks.
The court considered the difference to be significant because 92% of all
patrolmen scored between 80 and 95 on the measure.

Clearly, the first major question an employer should address concerns the
relative effects of performance appraisals on minorities and nonminorities.
If scores produced by such appraisals or decisions based upon these scores
do not adversely affect minorities, the appraisals are by definition
nondiscriminatory. All other things being equal, the performance appraisal
system which minimizes differences between minorities and nonminorities has
the least legal liability under Title VII as interpreted by the EEOC
guidelines. If performance appraisals do not have adverse impact, the
employer has no legal burden of proving the appraisal scores are related to
job performance. If there is adverse impact, the performance appraisals are
prima facie discriminatory and the employer must present empirical evidence
to the courts proving the appraisals are valid.

Before turning to the requirements for proving validity, I would note that
the employer is responsible for keeping accurate records on the results of
employment decisions for each protected group under Title VII. At one time
it was not considered proper or even legal to keep minority identification
in the personnel records. Under Title VII requirements such information is
essential.

Establishing Validity 

The 1970 EEOC guidelines specify the following requirement for establishing
the validity of a "test":

Evidence of a test's validity should consist of empirical data demonstrating
that the test is predictive of or significantly correlated with important
elements of work behavior which comprise or are relevant to the job or jobs
for which the candidates are being evaluated (1607.4(c)).

As noted earlier, the 1966 EEOC guidelines were patterned after the APA
Standards applicable to written tests used mainly in hiring new employees.
The 1970 guidelines were extended to apply to virtually every criterion
which could be used as a basis in making personnel decisions. The extension
came solely in the broader definition of a test cited earlier. The
requirements for evidence of validity, which constitute the bulk of the 1970
version, endorse the concept of criterion-related validity developed for
achievement or ability tests.

Criterion-related validity involves the process of correlating individual
test scores to independent measures of actual job performance. There are two
types of criterion-related validity, predictive and concurrent. Predictive
validity is longitudinal in design. Applicants are tested before being hired
and then followed up, after a period of time on the job, with measures of
job performance. Concurrent validity is cross-sectional in design. Job
incumbents are given the test and the job performance measures at about the
same time. In both cases, correlations between test scores and performance
scores constitute the validity evidence.
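
As a minimal sketch of what such validity evidence looks like numerically, the correlation can be computed as an ordinary Pearson coefficient. The function name and the six data points are hypothetical, chosen only to make the example runnable; the same calculation applies whether the design is predictive or concurrent, since the two differ in when the criterion is collected, not in the statistic.

```python
import math


def validity_coefficient(test_scores, performance_scores):
    """Pearson correlation between test scores and a job performance criterion."""
    n = len(test_scores)
    if n != len(performance_scores) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 cases")
    mean_x = sum(test_scores) / n
    mean_y = sum(performance_scores) / n
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(test_scores, performance_scores))
    sxx = sum((x - mean_x) ** 2 for x in test_scores)
    syy = sum((y - mean_y) ** 2 for y in performance_scores)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined when a variable has no variance")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data for six employees: test score and later supervisor rating.
tests = [52, 61, 70, 74, 83, 90]
ratings = [3.1, 3.4, 3.2, 4.0, 4.4, 4.6]
r = validity_coefficient(tests, ratings)
```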

Courts have shown a distinct preference for predictive over concurrent
validity. This is primarily because a position, admission to which is
contingent upon a test which has adverse impact, is not likely to have many
minorities as job incumbents. Since incumbents constitute the sample of a
concurrent study, no stable estimate of the relationship between test scores
and job performance measures for minorities can be made. The courts have
been consistent in rejecting evidence of test validity based upon
nonminority samples as evidence that the test is valid for minorities. The
guidelines require that "data must be generated and results separately
reported for minority and nonminority groups" (1607.5(b)(5)).

The requirement that employers must check for possible differential validity
is thoroughly imbedded in the case law of Title VII litigation.
Protestations by professional psychologists that differential validity
doesn't exist are not likely to remove it in the near future. One of the
main controversies surrounding the currently proposed uniform guidelines of
the Equal Employment Opportunity Coordinating Council (EEOCC) is whether or
not the requirement to check for differential validity should be retained.
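
One conventional way to check for differential validity is to compare the validity coefficients obtained separately in the two groups. The sketch below uses Fisher's r-to-z test for two independent correlations. That choice is an assumption made here for illustration, not a procedure named in the guidelines or by the courts, and the function name and input values are hypothetical.

```python
import math


def compare_validity_coefficients(r_minority, n_minority, r_majority, n_majority):
    """Fisher z-test for the difference between two independent correlations.

    Returns (z, two_sided_p).
    """
    if n_minority <= 3 or n_majority <= 3:
        raise ValueError("each group needs more than 3 cases")
    if not (-1 < r_minority < 1 and -1 < r_majority < 1):
        raise ValueError("correlations must lie strictly between -1 and 1")
    # Fisher's r-to-z transformation makes the sampling distribution
    # approximately normal with variance 1 / (n - 3).
    z_minority = math.atanh(r_minority)
    z_majority = math.atanh(r_majority)
    std_err = math.sqrt(1 / (n_minority - 3) + 1 / (n_majority - 3))
    z = (z_minority - z_majority) / std_err
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical coefficients from separate minority and nonminority studies.
z, p = compare_validity_coefficients(0.10, 103, 0.50, 103)
```

The standard error grows as either group shrinks, which is the numerical form of the point made above: with few minority incumbents, no stable estimate for that group can be made.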

The guidelines, while definitely endorsing criterion-related validity, are
ambivalent concerning the alternative approach to validity called content
validity. The best understanding of content validity can be obtained from
the definition given it by the courts:

For a test to be content valid, the knowledge, skills and aptitudes required
for successful examination performance must be the knowledge, skills and
aptitudes required for successful job performance (US vs City of Chicago, 8
EPD, 9785).

The definition given in the APA 1974 standard states:

To demonstrate content validity of a set of test scores, one must show that
the behaviors demonstrated in testing constitute a representative sample of
behaviors to be exhibited in a desired performance domain (p. 28).

The ambivalence of the EEOC guidelines on content validity is reflected in
the following two statements from section 1607.5(a):

"Evidence of content validity may also be appropriate where
criterion-related validity is not feasible."

"Evidence of content validity alone may be acceptable for well developed
tests that consist of suitable samples of the essential knowledge, skills or
behavior composing the job in question."

The first statement says content validity is definitely second best. The
second statement provides a qualified endorsement of content validity. The
courts have extrapolated EEOC ambivalence to an implicit but clear hierarchy
of validity approaches:

Predictive Validity - most preferred
Concurrent Validity - second best
Content Validity - least preferred

The courts have identified content validity as "a form of 'rational
validity', rather than empirical validity. The (content validity) analysis
depends appreciably on opinions of psychologists" (US vs City of Chicago, 8
EPD, 9785).

It is my view that this distinction between rational and empirical validity
is illusory, more semantic than real. In fact, as one examines the text of
court opinions in Title VII cases, it is clear that a comprehensive process,
"rational validity," has been developed by the judicial system. This process
is what I have called "judicial validation." A limitation on the usefulness
of the concept of judicial validity is that, to date, courts have had a
great deal more to say about what is not acceptable than what is acceptable
employment practice. Perhaps it would better be labeled "judicial
invalidity." Whatever term one uses, I mean to refer to the process by which
the courts have systematically examined an employment system, critiqued
empirical evidence, and weighed the evidence of expert witnesses in arriving
at a decision that a test is invalid or valid. Of specific interest for the
present discussion is the process used by the courts to evaluate performance
ratings.

I have formulated a decision flow chart as a means of summarizing my
conception of judicial validity. I would like to outline, by means of the
flow chart, the factors which have been emphasized by courts in critically
examining performance appraisals. The critical element throughout the chart
is V, the judicial validity coefficient. It is a qualitative index which is
used to illustrate the relative contribution of each factor to the overall
decisions. The weightings involved are based upon my intuitive estimate of
relative contribution and are presented as a didactic tool. No precision is
implied in the numbers used. However, the relative sizes of the various
terms are meant to reflect my perceptions of the relative merits each step
has in litigation.

The Judicial Validation Process

What has been covered up to now has been the guidelines, formulated, as the
Supreme Court said in Griggs, to be the administrative interpretation of the
Civil Rights Act. However, ultimate interpretation of legislation is a
judicial responsibility. Judicial interpretations are the ones that count in
the real world. What follows is my conception of the courts' interpretation
of Title VII to date. I would emphasize the qualification to date. Judicial
interpretations are dynamic and thus a description at any point in time can
quickly become outdated.

Further, I would caution that I have attempted to describe an historical
process, not a normative one. In many instances my perception of what is and
what ought to be are quite divergent. I do not in this context discuss the
latter.

Throughout the flow chart which I have used to define judicial validation
(JV), a variable (V) is used as a crude quantification of the outcome. Its
purpose is primarily heuristic. Also, all decision points have been
arbitrarily reduced to a small number of discrete alternatives. Undoubtedly,
some of the parameters involved are continuous (e.g., adverse impact
statistics) but such representation would involve unnecessary complexity.

The first and crucial point of JV is the evaluation of adverse impact of the
personnel decision, the basis for which is some appraisal system. It should
be emphasized that decisions have adverse impact, tests do not. Criteria
used by the decision maker become the subject of JV only to the extent that
the decisions adversely affect a protected group. Thus it is possible to
employ "tests" to make personnel decisions and avoid their legal liability
by monitoring their impact. Giving up testing or stopping appraisals is not
the only way to avoid litigation. Crudely put, if your numbers come out
right, you have nothing to worry about. If the decisions have no adverse
impact, by definition the criteria used are nondiscriminatory.

If there is adverse impact, the outcome of JV will be a function of its
degree. This contingency is not discussed in any of the so-called guidelines
but it is a legal reality. The Supreme Court in Albemarle v. Moody reflected
the contingency when it ruled that "...there simply was no way to determine
whether the criteria actually considered were sufficiently related to the
company's legitimate interest in job specific ability to justify a testing
system with a racially discriminatory impact" (p. 305) (emphasis added).
Further, it is my personal opinion that the recent ruling of the Court in
Washington v Davis was due in part to the mild adverse impact of Test 21.

In making the outcome of JV contingent in part upon the degree of adverse
impact, the courts have included social utilities in the validation process.
Recent writers such as Peterson and Novick (1976) have proposed the
inclusion of such utilities in psychometric models, but we are far from a
consensus.

The second decision point is to decide if the adverse impact of the
appraisal system is due to seniority. For the first several years in a new
position, appraisal scores are likely to be correlated with seniority. It
may be that minorities have less seniority and thus lower scores due to past
discrimination. The courts have held that such a situation is the present
effect of a past practice of discrimination and have ordered the appraisals
dropped or altered to reduce the adverse impact (cf. Harper vs Mayor and
City Council of Baltimore).

In the next step of JV we find one of the pivotal issues on which
discrimination cases turn: the job analysis. Is the appraisal system based
upon a comprehensive job analysis? Did the employer attempt to
systematically identify the essential knowledge, skills, and behaviors
composing the position being appraised? The burden of proving a job analysis
was sufficiently comprehensive is difficult to meet. There is at present no
definition of what constitutes a "comprehensive job analysis." The JV
process highlights three key elements: (1) persons carrying out the analysis
should be job experts; (2) a team approach using independent judgements is
most desirable; and (3) both frequency and importance of elements should be
identified. Varying quality of the job analysis will have varying effect on
V.

If there has been no job analysis of substance in the formulating of the
performance appraisal, a decision point is reached. If the employer will
incur no monetary liability in the form of back pay for past use of the
appraisals, then they should be discontinued and a new appraisal system
formulated based upon proper job analysis. If potential back pay awards are
large, then it may be necessary to move ahead and try to prove they were
valid. The probability of establishing the ratings as valid given moderate
to severe adverse impact would be low (V = -7 or -10 at this point).

An interesting legal question on job analyses remains largely unanswered.
Suppose a defendant in a Title VII litigation involving a performance
appraisal system presented a job analysis done post hoc. Further, suppose
that the job analysis so done supported the appraisal system. All the
guidelines currently endorsed say the job analysis must precede test
development. Clearly this requirement is based on the belief that a thorough
job analysis is more likely to result in a valid test. But the guidelines
say nothing about the evidentiary value of a post hoc job analysis. A job
analysis is not a sufficient condition for a test to be valid. The question
raised here is "Is it necessary?" given the above suppositions.

The next step in JV involves a consideration of the performance appraisal
process. The model indicates that standardized situations are preferable
to on-the-job ratings by supervisors. Further, if the performance is reliably
measured by actual output, then with almost any type of job analysis the
system would be considered valid. An example of such a test for machinists
was reported by Schmidt et al. (1975).
The next decision point involves an analysis of the structure of the
rating form used by the interviewer or supervisor. Ratings of behaviors
are considered a plus and ratings of traits are considered a minus, due to
the varying degree of subjectivity and level of inference involved in the
rater's judgement. If the behaviors are evaluated using behaviorally
anchored scales (Campbell et al., 1973), then V is further enhanced.

The next section indicates an aspect of JV that is unique to
performance appraisal systems. Since appraisals involve evaluation of one
person by another, the rater must come under scrutiny. The first question
asks how many raters are involved. Courts view supervisor judgements as
inherently subjective and regard them with great suspicion if they support
decisions with adverse impact. In human judgements there is, to some
extent, objectivity in numbers. The extent of training raters receive
can adversely affect V if none is given, or enhance V if the training is
thorough and interrater agreement is established at some acceptable level.
Finally, the rater's qualifications will influence the outcome of JV.
Both experience on the job being evaluated and experience as a rater can
enhance the rater's credibility.

Sometimes raters and work sample simulations are combined, as in the
study of telephone operators by Gael et al. (1975) or the study of
firefighters we conducted in Baltimore (Livingston et al., 197?).

The next step asks what type of validity is claimed for the appraisal
systems. Here the terms used in the guidelines come into play. The main
thrust of the flow-chart is that a number of key rational questions
relevant to the final decision on validity have been raised before the
empirical data are consulted. It is in the context of these rational
questions that the weight to be given the empirical evidence is determined.

If a criterion-related validity study is available, the questions
raised in this section involve technical adequacy of the methodology.
At this point familiar psychometric issues are involved. Key points
of concern to the courts have been the nature of the sample. As noted
earlier, separate validity studies for minority groups are definitely
preferred. Absence of a separate study for minorities is not a liability
if the employer can convince the court such a study was not technically
feasible. In the presence of moderate or severe adverse impact, it will
be difficult to make such a showing.

The main statistical outputs of a criterion-related validity study
are means, standard deviations, correlation coefficients and regression
equations. The details of what constitutes a nondiscriminatory test in
terms of these various statistics are extremely technical, as illustrated
by the special issue of the Journal of Educational Measurement last spring
(1976). To compound matters, there are considerable differences of opinion
among psychologists on the technical definition of a "fair test." It is
little wonder that after a court has heard two psychologists express three
opinions, they view such "expert opinion" with a jaundiced eye. The
following quote from the judge in U.S.A. v. City of Chicago reflects this
cynicism:

    The defendants have chosen to lead the court "deep into the
    jargon of psychological testing." The result has been a virtual
    morass of competing theories advanced by professional testers, and
    tests in which the debate has centered on predictive, concurrent,
    criterion and construct validation and the court has been left with
    the unwelcomed task of testing the testers. It is not amiss to
    observe that plaintiffs have not shunned the debate. (8 EPD 978?)

Some professionals (for example Sharf, 1975) believe the problem can
be eliminated by educating the public and the courts. Others feel
psychologists need training in giving testimony as experts. I do not feel
either approach gets at the heart of the problem. Perhaps the problem is
best summarized in the statement by Pogo: "We have met the enemy...and
they is us."

I believe the current state of affairs in the litigation of testing
is the result of eager endorsement of psychometric ideals by many industrial
psychologists who welcomed the judicial review of Title VII litigation as
a means to improve testing practices in business and government. They
tried to use the legal process to force changes not easily brought about
professionally. But after 5 to 10 years of seeing these psychometric
ideals applied via an advocacy process to inferior testing programs, many
are fearing a monster has been created. A number of the professionals
involved early in the process are finding it difficult to back away from
the characterization current legal advocates have given to their initial
professional opinions. Differential validity is but one example of a
concept which enjoyed wide professional endorsement initially but has
fallen into disrepute. The theories supporting differential validity
still play a large role in litigation. Yet, considering the scientific
evidence supporting the concept, I believe it fair to say that if
"differential validity" were a test it would be enjoined from further
use. In short, we have promised the courts, with our high sounding jargon
and our sophisticated mathematics, more certainty than we can deliver and
are paying the toll for overselling the product. But this is off the
subject of how courts have actually viewed statistics, which is the primary
concern of the present discussion.

The main concern of the courts to date has been with both the statistical
and practical significance of the validity coefficient, that is, the
correlation between the "test" being validated and the criterion measure of
job performance. Statistical significance of a correlation coefficient is a
precisely defined entity. It is determined by comparing the obtained
coefficient with a critical value obtained from a table in the back of any
statistics book. Any person capable of using the logical operators
"greater than" or "less than" can determine if a correlation coefficient
is statistically significant at some specified level of probability.
(The EEOC guidelines specify a less than 1 in 20 probability of chance
occurrence as statistically significant.) Most modern statistical
computer programs provide the significance level to four decimal places in
the output. It is only in the significance level of the correlation
coefficient that the validity evidence is "empirical" rather than
"rational." It is at this point that the illusion of mathematical
objectivity is so misleading. The previous steps in the flow chart
emphasize the importance of nonquantitative or "rational" judgements
in the total process. An important article entitled "Trial by
Mathematics: Precision and Ritual in the Legal Process" (Tribe, 1971) is
helpful in gaining a broader perspective on how the apparent elegance and
objectivity of mathematical evidence can distort the judicial process.
In the case of Title VII litigation, Pearson product moment correlations
and regression equations have become the mathematical "tail" wagging the
judicial "dog."
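
The table lookup described above can be sketched in a few lines of Python.
This is a minimal illustration and not part of the guidelines: the function
names pearson_r, critical_r, and is_significant are hypothetical, the table
is an abridged copy of the two-tailed .05 critical values printed in
standard statistics texts, and falling back to the nearest smaller tabled
degrees of freedom is a conservative assumption of this sketch.

```python
import bisect
import math

# Abridged two-tailed critical values of Pearson r at the .05 level,
# keyed by degrees of freedom (df = n - 2), as printed in standard tables.
CRITICAL_R_05 = {
    5: 0.754, 10: 0.576, 15: 0.482, 20: 0.423, 25: 0.381,
    30: 0.349, 40: 0.304, 50: 0.273, 60: 0.250, 100: 0.195,
}


def pearson_r(test_scores, criterion_scores):
    """Pearson product moment correlation between test and criterion."""
    n = len(test_scores)
    if n != len(criterion_scores) or n < 3:
        raise ValueError("need two equal-length samples of at least 3 cases")
    mean_x = sum(test_scores) / n
    mean_y = sum(criterion_scores) / n
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(test_scores, criterion_scores))
    sxx = sum((x - mean_x) ** 2 for x in test_scores)
    syy = sum((y - mean_y) ** 2 for y in criterion_scores)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


def critical_r(n):
    """Look up the tabled critical value for a sample of n paired cases.

    Assumption of this sketch: when the exact df is not tabled, use the
    largest tabled df that does not exceed it, which gives a slightly
    larger (more conservative) critical value.
    """
    df = n - 2
    tabled = sorted(CRITICAL_R_05)
    if df < tabled[0]:
        raise ValueError("sample too small for this abridged table")
    index = bisect.bisect_right(tabled, df) - 1
    return CRITICAL_R_05[tabled[index]]


def is_significant(r, n):
    """The 'greater than' comparison: is |r| above the critical value?"""
    return abs(r) > critical_r(n)
```

With 52 paired observations (50 degrees of freedom) a coefficient of .30
exceeds the tabled value of .273; the same coefficient obtained from 22
observations falls short of the tabled value of .423. The comparison itself
is mechanical, which is the point made in the text: the judgement lies
elsewhere.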

The overarching importance of rational judgement even at this most
empirical step of the process I have called judicial validation is reflected
in the concept of "practical significance" articulated by the EEOC guidelines
in section 1607.5 (c) (2). Briefly, practical significance is a function
of selection ratios (proportion of applicants actually hired, promoted,
or laid off), success ratios (proportion of applicants successful on the job
without using the test), and business necessity (economic or human risk
factors). Of these three factors, selection ratios and business
necessity have been the ones involved in the judicial process. I have
formulated a three dimensional table intended to characterize the
relative influence of various combinations of these factors on V. A
situation involving high business necessity, low selection ratios, and a
criterion-predictor correlation greater than .30 is the strongest
empirical evidence for the validity of the test.
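
The two ratios defined above are simple proportions, and the strongest-case
combination can be written as a single check. The sketch below uses
hypothetical function names and a hypothetical cutoff of .25 for what
counts as a "low" selection ratio; neither the guidelines nor the table as
described here fix that number. Only the .30 correlation threshold comes
from the text.

```python
def selection_ratio(n_selected, n_applicants):
    """Proportion of applicants actually hired, promoted, or laid off."""
    if n_applicants <= 0 or not 0 <= n_selected <= n_applicants:
        raise ValueError("counts must satisfy 0 <= selected <= applicants")
    return n_selected / n_applicants


def success_ratio(n_successful, n_placed_without_test):
    """Proportion successful on the job when the test is not used."""
    if n_placed_without_test <= 0 or not 0 <= n_successful <= n_placed_without_test:
        raise ValueError("counts must satisfy 0 <= successful <= placed")
    return n_successful / n_placed_without_test


def strongest_empirical_case(validity_r, sel_ratio, high_business_necessity,
                             low_ratio_cutoff=0.25):
    """True when all three conditions named in the text hold together.

    low_ratio_cutoff is an illustrative assumption, not a legal standard.
    """
    return (high_business_necessity
            and sel_ratio <= low_ratio_cutoff
            and validity_r > 0.30)
```

For example, hiring 10 of 100 applicants gives a selection ratio of .10,
which together with high business necessity and a validity coefficient of
.35 would satisfy the check; a coefficient of .25 would not.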

However, assessing practical utility is not the end. The statistical
evidence is weighed in terms of the criterion or criteria used in the study.
It is at this point that we see the illusion of empirical objectivity most
clearly, because the criteria in the validity study are themselves subject
to the scrutiny of judicial validity. At this point the flow-chart loops
back to step 2, where the job-relatedness of the criterion is examined. As
long as criterion-related data are presented, the looping process will go
on; the criterion measure is always subject to rational scrutiny. A point
must be reached in the decision process where a criterion is evaluated on
its own merit rationally if a decision is to be made. This involves a
judgement of its content validity. Thus, while the EEOC guidelines and
courts explicitly endorse empirical validity, both logically and
realistically, rational analysis is the overarching, more pervasive
characteristic of judicial validity. What is portrayed in the flow-chart
as judicial validity is similar to the model of "procedural job relatedness"
presented by Kohls (1975). Kohls' technique emphasizes consideration of
adverse impact, job analysis, use of job experts, and job performance of
selected individuals. Judicial validity as presented here is broader in
scope.

Since the court must at some point make a judgment concerning content
validity, we need to look at the key element of content validity. The
central element of all available content validity definitions is the
requirement that the "test" sample essential behaviors. The only type of
measure that satisfies this requirement unambiguously is a job performance
simulation. Such a measure has what Guion has labeled "operational
validity" (Guion, 1975). The behaviors operationalized in simulation
should be the critical behaviors required for job success.

Job simulations are preferable to actual on-the-job behaviors in
several respects. The standardization of task demands allows for control
of exogenous influences on ratee performance. Quantification of performance
can be objectified even if raters are involved, through training and use
of multiple raters. On-the-job ratings by a single supervisor, even when
behaviorally anchored scales are employed, are suspect in court due to the
inherent subjectivity of a personal judgement. Recent research by Gael,
Grant, & Ritchie (1975a, 1975b) presents two job performance measures which
possess a high level of operational validity. Even though these measures
have marked adverse impact (minorities performed significantly lower than
nonminorities on all dimensions of both job simulations), it is my judgement
they would be found job-related within the judicial validation process.
Suppose that the authors had only used behaviorally anchored scales to obtain
supervisor ratings of on-the-job performance. Extrapolating from the
simulation results reported by the authors, the supervisor ratings would also
evidence adverse impact. Given the extent of adverse impact and court
skepticism of "subjective" supervisor ratings, the status of the ratings
within judicial validity would be questionable. The relationship between
the behaviorally anchored supervisor ratings of on-the-job performance and
job-simulation performance would need to be demonstrated.

Social Utility

Civil rights and employment testing involve social, emotional and
philosophical as well as the job related issues. To those individuals who
are especially sensitive to these other issues my presentation may appear
crassly pragmatic. If this is so, it is because I have limited my
presentation to issues explicitly addressed in the federal guidelines or the
courts.



There are social questions raised by Title VII which affect our individual
viewpoints. Critics of employment testing are committed to increasing
minority participation in all levels of the work force. Many endorse
preferential treatment of minorities to accomplish this goal, stating
"race conscious evils require race conscious remedies." The recent Supreme
Court decision to dodge this issue in the DeFunis reverse discrimination
case involving a University of Washington law student reflects our societal
reluctance to grapple with the issue of social utilities explicitly at any
policy level.

I would argue that the judicial validity model reflects the courts'
position on at least three factors influencing social utility. These
factors are adverse impact, selection ratios, and business necessity. The
adverse impact factor is loaded in favor of minorities. The courts have
made validity requirements more stringent as a function of adverse impact.
The other two factors attempt to consider utility from the employer's point
of view. What is not reflected in the model is the social utility of the
individual judges. Anyone who has been involved in several court cases is
struck by the wide variation in behavior of judges. My own inference is that
the judges' behavior in handling Title VII cases is significantly influenced
by their personal social utilities. The appellate system of courts is
supposed to offset a judge's personal biases. But the first judge at the
district level has a great deal of leeway in trial procedure while still
avoiding reversible error. Judicial utility is an unknown factor until the
case is actually assigned to a judge, but it will have an influence.

My personal opinion is that the social utilities which have evolved
from the judicial process as noted earlier are reasonable. Furthermore,
the judicial system is the only social policy force in our current
system which could systematically explore such a controversial area.
Otherwise, the power of special interests would distort our cultural
values and immobilize our legislative system.